schedule: add support for user-space LL scheduler #10508
Pull request overview
This PR adds initial support for running SOF Low-Latency (LL) scheduler tasks in Zephyr user-space, providing memory protection and isolation between audio code and kernel resources.
Changes:
- Adds user-space LL scheduler support with dedicated memory domains and heap management
- Replaces spinlocks with mutexes for user-space compatibility
- Introduces test case to validate LL task creation and execution in user-space mode
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| zephyr/test/userspace/test_ll_task.c | New test case validating user-space LL scheduler functionality with task lifecycle management |
| zephyr/test/userspace/README.md | Documentation update describing new LL scheduler test |
| zephyr/test/CMakeLists.txt | Build configuration to include LL task test when CONFIG_SOF_USERSPACE_LL is enabled |
| zephyr/Kconfig | New CONFIG_SOF_USERSPACE_LL option for enabling user-space LL pipelines |
| src/schedule/zephyr_ll.c | Core LL scheduler implementation modified to support user-space execution with dynamic memory allocation |
| src/schedule/zephyr_domain.c | Domain thread management updated for user-space with mutex-based synchronization |
| src/schedule/Kconfig | Statistics logging disabled for user-space LL scheduler |
| src/init/init.c | Initialization hook for user-space LL resources |
| src/include/sof/schedule/ll_schedule_domain.h | Header updates exposing user-space LL APIs and mutex-based locking |
| src/include/sof/schedule/ll_schedule.h | API declarations for user-space LL heap and memory domain management |
| src/debug/telemetry/Kconfig | Telemetry disabled when user-space LL is enabled |
src/schedule/zephyr_domain.c
Outdated
    ll_sch_domain_set_pdata(domain, zephyr_domain);

    struct zephyr_domain_thread *dt = zephyr_domain->domain_thread + cpu_get_id();
Copilot AI (Jan 29, 2026):
Variable declaration should be at the beginning of the function or block. Move this declaration to the top of the function for consistency with C89/C90 style if required by the project, or to improve readability.
Suggested change:
    - ll_sch_domain_set_pdata(domain, zephyr_domain);
    - struct zephyr_domain_thread *dt = zephyr_domain->domain_thread + cpu_get_id();
    + struct zephyr_domain_thread *dt;
    + ll_sch_domain_set_pdata(domain, zephyr_domain);
    + dt = zephyr_domain->domain_thread + cpu_get_id();
Zephyr and SOF allow C99 usage: https://docs.zephyrproject.org/latest/develop/languages/c/index.html
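For readers unfamiliar with the point being made, here is a minimal, self-contained sketch of the C99 feature in question: a declaration placed after a statement, at the point of first use. The names `demo_slot` and `demo_pick` are hypothetical and not part of SOF or Zephyr; the pointer arithmetic only loosely mirrors the `domain_thread + cpu_get_id()` expression in the hunk.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-core structure. */
struct demo_slot {
	int value;
};

/*
 * Tags slot 0, then returns the value of slot `idx`.
 * The pointer is declared after a statement, which C99 permits
 * and C89/C90 does not.
 */
static int demo_pick(struct demo_slot *slots, size_t idx)
{
	slots[0].value = 100;			/* statement first */
	struct demo_slot *s = slots + idx;	/* declaration at first use */

	return s->value;
}
```

Declaring the pointer where it is initialized keeps it from existing in an uninitialized state, which is one common argument for the C99 style.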
Example test run (on Intel PTL):
    #define schedule_task_init_ll zephyr_ll_task_init

    struct task *zephyr_ll_task_alloc(void);
    k_tid_t zephyr_ll_get_thread(int core);
nitpick: I think we mostly use struct k_thread * in SOF, and it seems to "work better" with various simulation/testing builds. I was getting "undefined" errors when I tried to use k_tid_t.
@lyakh Seems we have a mix in existing code as well. I will switch over a few places, but I won't start changing existing code away from k_tid_t in this PR.
    #if defined(CONFIG_SOF_USERSPACE_LL)
    /* Allocate mutex dynamically for userspace access */
    domain->lock = k_object_alloc(K_OBJ_MUTEX);
I think this allocates cached?
OK, checked now. With this PR, these allocations come from the system heap and will be uncached. So this is good, but I need to keep close tabs on this.
src/schedule/zephyr_domain.c
Outdated
    #if CONFIG_SOF_USERSPACE_LL

    k_tid_t zephyr_domain_thread_tid(struct ll_schedule_domain *domain)
struct k_thread * maybe
This is clearly a new addition, so let me change this in V2.
zephyr/test/userspace/test_ll_task.c
Outdated
    ZTEST(userspace_ll, ll_task_test)
    {
    	ll_task_test();
    	ztest_test_pass();
I actually removed these from my tests, they're doing some long jumps... Are you sure you need this?
Just following existing tests, but you are right, I should probably remove it.
I think this should only be used in specific cases like: "However, if the success case for your test involves a fatal fault, you can call this function from k_sys_fatal_error_handler to indicate that the test passed before aborting the thread."
    - if (CONFIG_SOF_BOOT_TEST_STANDALONE AND CONFIG_SOF_USERSPACE_LL)
    + if(CONFIG_SOF_BOOT_TEST_STANDALONE AND CONFIG_SOF_USERSPACE_LL)
      zephyr_library_sources(userspace/test_ll_task.c)
Looks like aligning to use TABs instead would make the patch smaller.
ack - copilot align to tab 8
I don't really care either way, but this 2-space indent is the style used in both upstream CMake and upstream Zephyr, so given the mix of styles we currently have, I think we should just go with this.
softwarecki
left a comment
First quick remarks. I still have 2 commits left to review... There is a lot of conditional code added here. Would it not be better to make this a separate scheduler? SOF already supports multiple different schedulers
    #endif /* CONFIG_SOF_USERSPACE_LL */

    struct zephyr_domain_thread {
    	struct k_thread ll_thread;
Can we keep these static objects when CONFIG_SOF_USERSPACE_LL is not enabled? We could keep pointer fields that point to static objects, similar to the dp scheduler solution.
That's a good point. I went back and forth on this a bit. It is a bit messy to have both, but I do agree there is value in not touching the current implementation (and keeping the objects static). Let me try this in V2 and see how the code looks to you all.
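To make the trade-off discussed in this thread concrete, here is a minimal, self-contained sketch of the "pointer field that points to a static object" pattern. It is not the SOF implementation: `demo_thread`, `demo_domain_thread`, `DEMO_USERSPACE` and the helper functions are all hypothetical, and `calloc()`/`free()` merely stand in for a dynamic kernel-object allocator such as `k_object_alloc()`/`k_object_free()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel object such as struct k_thread. */
struct demo_thread {
	int id;
};

/* Set to 1 to mimic a build where objects must be allocated dynamically. */
#ifndef DEMO_USERSPACE
#define DEMO_USERSPACE 0
#endif

#define DEMO_CORES 2

struct demo_domain_thread {
	struct demo_thread *thread;	/* always accessed through a pointer */
};

#if !DEMO_USERSPACE
/* Kernel-style build: the backing storage stays static. */
static struct demo_thread demo_static_threads[DEMO_CORES];
#endif

/* Returns true on success; dt->thread is valid afterwards. */
static bool demo_thread_get(struct demo_domain_thread *dt, int core)
{
	if (core < 0 || core >= DEMO_CORES)
		return false;
#if DEMO_USERSPACE
	/* calloc() stands in for a dynamic kernel-object allocator. */
	dt->thread = calloc(1, sizeof(*dt->thread));
#else
	dt->thread = &demo_static_threads[core];
#endif
	if (!dt->thread)
		return false;
	dt->thread->id = core;
	return true;
}

/* The release path must mirror the allocation path chosen above. */
static void demo_thread_put(struct demo_domain_thread *dt)
{
#if DEMO_USERSPACE
	free(dt->thread);
#endif
	dt->thread = NULL;
}
```

With this shape, the code that uses `dt->thread` is identical in both builds, and only the get/put helpers carry the conditional compilation.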
src/schedule/zephyr_domain.c
Outdated
    (void *)mem_partition.start,
    heap->heap.init_bytes);

    mem_partition.start = (uintptr_t)sys_cache_uncached_ptr_get(heap->heap.init_mem);
Zephyr maps both cached and non-cached addresses when the double-map config is enabled. Maybe it is worth checking that here?
no double mapping any more.
src/schedule/zephyr_domain.c
Outdated
    zephyr_domain->timer = k_object_alloc(K_OBJ_TIMER);
    if (!zephyr_domain->timer) {
    	tr_err(&ll_tr, "timer allocation failed");
    	rfree(zephyr_domain);
heap_free(zephyr_ll_heap(), ...
Thanks, fixed in V2.
The telemetry infra is calling privileged timer functions, so if the Low-Latency tasks are run in user-space, telemetry must be disabled. Signed-off-by: Kai Vehmanen <[email protected]>
The load tracking for Low-Latency tasks depends on low-overhead access to cycle counter (e.g. CCOUNT on xtensa), which is not currently available from user-space tasks. Add a dependency to ensure the LL stats can only be enabled if LL tasks are run in kernel mode. Signed-off-by: Kai Vehmanen <[email protected]>
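As an illustration of the dependency these two commit messages describe, a Kconfig fragment could look roughly like the following. The symbol name DEMO_LL_STATS is a hypothetical placeholder, since the actual option in src/schedule/Kconfig is not shown in this thread; only the `depends on !SOF_USERSPACE_LL` line mirrors what the telemetry hunk in this PR uses.

```kconfig
# Hypothetical symbol name, for illustration only.
config DEMO_LL_STATS
	bool "Collect LL scheduler load statistics (illustrative)"
	depends on !SOF_USERSPACE_LL
	help
	  Load tracking needs low-overhead access to the cycle counter,
	  so it can only be selected when LL tasks run in kernel mode.
```

With such a dependency, the option is not selectable in builds that enable SOF_USERSPACE_LL.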
jsarha
left a comment
Went through this once, but did not find much that had not already been picked up. There are still many places I do not fully understand. I gather there is a new version coming; I will go through this again when it's out.
src/schedule/zephyr_domain.c
Outdated
    - k_thread_abort(&zephyr_domain->domain_thread[core].ll_thread);
    + if (zephyr_domain->domain_thread[core].ll_thread) {
    + 	k_thread_abort(zephyr_domain->domain_thread[core].ll_thread);
    + 	k_object_free(zephyr_domain->domain_thread[core].ll_thread);
Shouldn't it be:
    #ifdef CONFIG_SOF_USERSPACE_LL
    k_object_free(zephyr_domain->domain_thread[core].ll_thread);
    #else
    rfree(zephyr_domain->domain_thread[core].ll_thread);
    #endif
lgirdwood
left a comment
Only some minor things from me.
    config SOF_TELEMETRY
    	bool "enable telemetry"
    	default n
    	depends on !SOF_USERSPACE_LL
I assume here we still need to map this page with timer IO for user as RO ?
@lgirdwood Not sure I understand the connection with timer IO. Telemetry does direct writes to the debug window in shared memory, and these writes are done from various call sites in the SOF codebase. Mapping the debug window to the LL thread is one option, but given this is optional (and not really used in Linux), this is something we can easily tackle later if needed.
    if (!IS_ENABLED(CONFIG_SOF_USERSPACE_LL) || !dt->ll_thread) {
    	/* Allocate thread structure dynamically */
    #if CONFIG_SOF_USERSPACE_LL
    	dt->ll_thread = k_object_alloc(K_OBJ_THREAD);
Would an object not equally work for LL kernel ? i.e. do we always need to differentiate here ?
@lgirdwood That's an interesting point. I actually started moving all builds to use k_object_alloc(), but then realized this interface is not available if CONFIG_USERSPACE is not set. So e.g. with Intel PTL, we could use this for all builds, but in build targets where userspace is not used, it won't work. So at least for now, I think we need to differentiate, but there is certainly potential to converge.
The main technical need is to register the objects to Zephyr kernel object database. We need this so we can grant access to the object to user threads.
    struct ll_schedule_domain *ll_domain; /* scheduling domain */
    unsigned int core; /* core ID of this instance */
    #if CONFIG_SOF_USERSPACE_LL
    struct k_mutex *lock; /* mutex for userspace */
Same question re differentiation, would a mutex work here too for a kernel thread in this use case ? Not a blocker or anything, it would be nice at some point to merge some of these flows around locking since at the end of the day we are using threads in both kernel/user mode.
Ack @lgirdwood, I do think there is potential to converge much more. We are running both IPC handling and the LL tasks in threads (not in ISRs), so I don't really think we need to use spinlocks even in kernel builds. So in theory we should be able to use the same locking code, and when run in user-space, the lock/unlock calls are just trapped as system calls. But for now, as user-space LL is still a developing feature, I want to keep modifications to the current kernel LL scheduler to a minimum. So continuing on this track with V2.
kv2019i
left a comment
Thanks a lot for the reviews! I answered most comments inline, but no V2 is uploaded yet. I still have a few comments from @softwarecki and @lyakh I need to cover before uploading V2.
@softwarecki I did look at the option to move code to a separate scheduler file. Especially in zephyr_domain.c, this would help keep the code readable. OTOH, most of the code is still shared and it does look possible to converge the kernel/user implementations more down the road. I have now implemented a bit of a compromise solution where I split out some user-LL specific functions to a separate file, and implemented separate domain register/unregister functions for user/kernel builds. This will make it easier to see I'm not modifying the existing default kernel LL implementation, while still reusing most of the common code. I'll tidy up the open items tomorrow and push a V2 for comments.
    #define scheduler_init_ll zephyr_ll_scheduler_init
    #define schedule_task_init_ll zephyr_ll_task_init

    struct task *zephyr_ll_task_alloc(void);
Ack, will fix in V2.
Add option to build SOF with support for running LL scheduler in user-space. This commit adds initial support in the scheduler and does not yet allow to run full SOF application using the new scheduler configuration, but has enough functionality to run scheduler level tests. No functional change to default build configuration where LL scheduler is run in kernel mode, or to platforms with no userspace support. Signed-off-by: Kai Vehmanen <[email protected]>
Add a test case to run tasks with low-latency (LL) scheduler in user-space. The test does not yet use any audio pipeline functionality, but uses similar interfaces towards the SOF scheduler interface. Signed-off-by: Kai Vehmanen <[email protected]>
There are multiple style variants used in SOF for CMakeLists.txt, but this file now contains multiple variants in the same file. Fix this and align style to Zephyr style (2 space for indent, no tabs, no space before opening brackets). Signed-off-by: Kai Vehmanen <[email protected]>
Force-pushed from e57df1a to b91a460
V2 pushed:
kv2019i
left a comment
Some additional inline replies to comments.
    #ifndef CONFIG_SOF_USERSPACE_LL
    /* TODO: what to do with notifiers? */
    notifier_event(sch, NOTIFIER_ID_LL_POST_RUN,
    	       NOTIFIER_TARGET_CORE_LOCAL, NULL, 0);
Ack @lyakh , added a TODO entry for now in V2 set.
Series that starts building up support for running SOF LL tasks in user-space (on platforms supporting Zephyr user-space). We already have support for DP tasks, so with both LL and DP supported, in theory all audio can be moved to user-space and run in separate memory space. This will isolate audio code from direct hardware access, protect kernel memory and device driver state.
This PR contains initial support for LL scheduler and adds a separate test case to mimic usage of SOF audio pipeline, without yet bringing in any audio dependencies.